PI: Dr. Thomas Girke, professor at Department of Botany and Plant Sciences, University of California, Riverside (thomas.girke@ucr.edu)
Maintainer: Jianhai Zhang
spatialHeatmap PackageThe spatialHeatmap package provides functionalities for visualizing cell-, tissue- and organ-specific data of biological assays by coloring the corresponding spatial features defined in anatomical images according to a numeric color key. The color scheme used to represent the assay values can be customized by the user. This core functionality is called a spatial heatmap plot. It is enhanced with nearest neighbor visualization tools for groups of measured items (e.g. gene modules) sharing related abundance profiles, including matrix heatmaps combined with hierarchical clustering dendrograms and network representations.
Related software tools for biological applications in this field are largely based on pure web applications (Winter et al. 2007; Waese et al. 2017) or local tools (Maag 2018; Muschelli, Sweeney, and Crainiceanu 2014) that typically lack customization functionalities. These restrictions limit users to utilizing pre-existing expression data and/or fixed sets of anatomical image collections. To close this gap for biological use cases, we have developed spatialHeatmap as a generic R/Bioconductor package for plotting quantitative values onto any type of spatially mapped images in a programmable environment and/or in an intuitive to use graphical user interface (GUI) application. For details, refer to the package vignette.
To plot spatial heatmaps, a pair of formatted data and aSVG (see aSVG below) image are required. The latter can come from the aSVG repository or created by users. If the target aSVG image is not available in this repository, users should make a custom aSVG image. This tutorial explains the detailed process of making aSVG images. To reproduce the results in this tutorial, all the used files are available to download.
This tutorial first introduces the aSVG repository. Second, three ways of making SVG shapes in Inkscape are presented. Third, the requirements on data format are explained. The spatialHeatmap package is able to work with EBI aSVGs directly. However, for the purpose of plotting spatial heatmaps there are unnecessary format requirements on these aSVGs, thus in the fourth section the simplified aSVG format is presented. The fifth section illustrates conversion from the simplified aSVG format to the advanced EBI aSVG format. If users make custom aSVGs only for their own use, the simplified format is recommended, since it is easier and time-saving. If they want to share with other EBI users they should follow the advanced format. Lastly, spatial heatmaps are demonstrated on the aSVG created in this tutorial.
The spatialHeatmap package should be installed from an R (version \(\ge\) 3.6) session with the BiocManager::install command.
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("spatialHeatmap")
Next, the packages required for running the sample code in this vignette need to be loaded.
library(spatialHeatmap); library(SummarizedExperiment); library(GEOquery)
The following lists the vignette(s) of this package in an HTML browser. Clicking the corresponding name will open this vignette.
browseVignettes('spatialHeatmap')
To assign colors to specific features in spatial heatmaps, annotated SVG (aSVG) files are used where the shapes of interest are labeled according to certain conventions so that they can be addressed and colored programmatically. An aSVG repository, that can be used by spatialHeatmap directly, has been generated by the EBI Gene Expression Group. It contains annatomical aSVG images from different species. These SVG images are also used by the Expression Atlas database. In addition, the spatialHeatmap has its own repository called spatialHeatmap aSVG Repository, where some aSVG files developed in this project are already deposited (e.g. Figure 1).
If users cannot find a target aSVG in the two repositories, this step-by-step SVG tutorial for creating custom aSVG images is recommended. The BAR eFP browser at University of Toronto contains many anotomical images, and these images are good templates for making custom aSVGs.
We will add more aSVGs to our repository in the future and users are welcome to deposit their own aSVGs there to share with other spatialHeatmap users.
To make SVG images, a PNG image with defined tissues and the SVG editor Inkscape are required. The image editor GIMP can be used if the tissue outlines in the PNG image are clear. Inkscape is used to draw the SVG image with the PNG image as a template, and annotate the SVG image in accordance with the data. The values in data are used to color different tissues in spatial heatmaps. GIMP could be used to automatically extract shapes for the SVG image.
There are 3 different options to make SVG images: Draw Over Template Shapes, Use Regular Shapes, Use GIMP. If tissues in the template image have unclear outlines, the first 2 options have to be used, as GIMP is applicable to tissues with clear outlines.
Select ‘Fill and Stroke…’ under ‘Object’ tab on the top. On the right panel ‘Fill and Stroke (Shift+Ctrl+F)’, set ‘Stroke style’ 3.000 px and press ‘Enter’ key.
Press ‘+’ key to zoom in and select a shape to start. Click at differencet corners of the shape to draw an outline. At last, click at the first corner to seal the outline.
If the new shape is filled with a color, click ‘No paint’ under the ‘Fill’ tab on the panel ‘Fill and Stroke (Shift+Ctrl+F)’. Then a new sealed transparent shape is completed.
Select ‘Edit paths by nodes (F2)’ on the left tool bar, and draw a rectangle over the new shape. Select ‘Make selected nodes corner’ on the top.
Drag nodes and edges to align the new shape with template shape. On the fill and stroke panel, under ‘Fill’ tab, select ‘Flat color’ and adjust the color scales to label the new shape. Then the first shape is made successfully.
If the template shapes are similar to regular shapes such as rectangles, circles. The regular shapes can be used to make new shapes.
Select ‘Create rectangles and squares (F4)’ on the left, and draw a rectangle over a shape template. Convert this object to path by selecting ‘Object to Path’ under ‘Path’ tab on the top.
Click ‘No paint’ under the ‘Fill’ tab on the fill and stroke panel to make the rectangle transparent. Rotate the rectangle. Select ‘Edit paths by nodes (F2)’ on the left tool bar. If necessary, add a node by double-clicking on an edge. Drag nodes and edges to align the rectangle with the underlying shape template.
Select ‘Edit paths by nodes (F2)’ on the left tool bar, and draw a rectangle over the new shape. Select ‘Make selected nodes corner’ on the top.
Drag the handles at nodes to adjust edges for fine alignment with the template shape. On the fill and stroke panel select ‘Flat color’ under ‘Fill’ tab to color this new shape. Then the new shape is successfully made.
If shapes in the template PNG image have clear outlines, the SVG image can be extracted with GIMP, since unclear outlines would lead to messay SVG shapes.
Open the PNG template in GIMP, and open ‘Paths’ panel. Right click and select ‘By Color’.
Now the shapes can be selected by colors. For exmaple, clicking on a whilte shape selects all shapes in white. Right click, select ‘To Path’, then all the white shapes are extracted to the ‘Paths’ panel. Similarly, extract the yellow shapes.
Click in front of each extracted shapes to show the ‘eye’ icon. Mouse over the extracted shapes, right click, select ‘Merge Visible Paths’. After merged, export the paths as an SVG image (root_gimp.svg). Next, edit the exported SVG image in Inkscape.
The exported SVG image root_gimp.svg is accessible here (hover over the image, right click, and select ‘Save image as…’).
Open the exported SVG image in Inkscape. Under ‘Object’ tab at the top, select ‘Fill and Stroke…’. First click the image, then select ‘Flat color’ under ‘Fill’ tab. Adjust color scales to fill the image.
All the paths in SVG image generated in GIMP are combined as a whole. In order to separate the paths, first click the image then click ‘Break Apart’ under the top ‘Path’ tab. Now the paths/shapes are separated, but the outlines of shapes are not stroked. Thus use ‘Ctrl+A’ to select all shapes, on the fill and stroke panel select ‘Flat color’ under ‘Stroke paint’ tab, and set a number under ‘Stroke style’ tab (e.g. 1.333 px).
Click the white area to unselect the whole image. Press ‘+’ key to zoom in, try to move different shapes, and delete those unnecessary by pressing ‘Delete’ key.
Use ‘Ctrl+A’ to select all shapes. Click ‘No paint’ under ‘Fill’ tab on the fill and stroke panel. Click the white area to unselect the whole image. The blank SVG image is ready to format with the data, which is downloadable here.
The bridges between data and aSVG are the sample features. Only features having matching counterparts between data and aSVG are colored in the spatial heatmaps. Therefore, the formatting process of SVG image is in accordance with the data formatting. The accepted data classes include vector, data frame, or SummarizedExperiment (SE) (Morgan et al. 2018). Formatting the data is essentially to define samples and/or conditions. In the following, data formatting is explained on SE, since this data class is widely used in biological omics analysis. Details on vector and data frame are presented in Supplement.
SE applies to data involving many samples and conditions. The data matrix with rows and columns being genes and sample/conditions respectively is stored in the assay slot. The data formatting is essentially to make a targets file. It is a data frame and usually contains at least 2 columns defining replicates of samples and conditions respectively. The targets file is in the colData slot. In the rowData slot, a data frame of annotations for rows in assay is optionally added.
The example data GSE14502 is from GEO. It is a microarray analysis on Arabidopsis thaliana root/shoot tissues under control and hypoxia (Mustroph et al. 2009), and is downloaded through GEOquery (Davis and Meltzer 2007).
Access the GEO dataset GSE14502 and convert it to SummarizedExperiment.
gset <- getGEO("GSE14502", GSEMatrix=TRUE, getGPL=TRUE)[[1]]
se.arab <- as(gset, "SummarizedExperiment")
Use gene symbols to replace probes.
rownames(se.arab) <- make.names(rowData(se.arab)[, 'Gene.Symbol'])
A slice of the data matrix in assay slot.
assay(se.arab)[1:3, c(25:29, 36:39)]
## GSM362192 GSM362193 GSM362194 GSM362195 GSM362196 GSM362203 GSM362204
## ORF25 4.759222 5.108704 5.017523 5.183043 4.956723 4.799543 5.092339
## NAD4L 4.838742 4.964549 4.877558 5.210359 5.316448 4.980785 5.418198
## ArthMp059 8.365082 5.655345 5.580664 5.990833 7.483474 6.310126 7.499970
## GSM362205 GSM362206
## ORF25 4.823938 5.209876
## NAD4L 4.985976 5.376374
## ArthMp059 5.952638 6.255218
A slice of the experiment design, which is stored in colData slot.
colData(se.arab)[c(25:29, 36:39), 1:4]
## DataFrame with 9 rows and 4 columns
## title geo_accession status
## <character> <character> <character>
## GSM362192 root_control_pGL2_rep1 GSM362192 Public on Oct 12 2009
## GSM362193 root_control_pGL2_rep2 GSM362193 Public on Oct 12 2009
## GSM362194 root_control_pGL2_rep3 GSM362194 Public on Oct 12 2009
## GSM362195 root_hypoxia_pGL2_rep1 GSM362195 Public on Oct 12 2009
## GSM362196 root_hypoxia_pGL2_rep2 GSM362196 Public on Oct 12 2009
## GSM362203 root_control_pCO2_rep1 GSM362203 Public on Oct 12 2009
## GSM362204 root_control_pCO2_rep2 GSM362204 Public on Oct 12 2009
## GSM362205 root_hypoxia_pCO2_rep1 GSM362205 Public on Oct 12 2009
## GSM362206 root_hypoxia_pCO2_rep2 GSM362206 Public on Oct 12 2009
## submission_date
## <character>
## GSM362192 Jan 21 2009
## GSM362193 Jan 21 2009
## GSM362194 Jan 21 2009
## GSM362195 Jan 21 2009
## GSM362196 Jan 21 2009
## GSM362203 Jan 21 2009
## GSM362204 Jan 21 2009
## GSM362205 Jan 21 2009
## GSM362206 Jan 21 2009
The title column includes ‘samples’ and ‘conditions’, so it is used to make the targets file based on the following requirements.
It is a data frame and usually has at least one column of samples and one column of conditions. The rows correspond with columns in assay slot. If the condition column is not defined, the samples are assumped under same condition.
The sample column specifies sample replicates. It is crucial that replicate names of the same sample must be identical. Otherwise, they are treated as different samples. E.g. ‘root_pGL2’ and ‘root_pCO2’ in Table 1.
The sample identifiers of interest must be identical with features of interest in aSVG respectively. It means even a dot, undescore, space, etc can make a difference and lead to target features not colored in spatial heatmaps. Since double underscore (__) is a reserved separator in spatialHeatmap, it cannot be used in sample or condition identifiers.
The condition column has the same requirement with the sample column. E.g. ‘control’ and ‘hypoxia’ in Table 1.
The completed targets file is packaged in spatialHeatmap, and is also downloadable here (click the file, click “Raw”, right click, and select “Save as…”). Selected rows are shown in Table 1.
tar.arab <- system.file('extdata/shinyApp/example/target_arab.txt', package='spatialHeatmap')
target.arab <- read.table(tar.arab, header=TRUE, row.names=1, sep='\t')
target.arab[c(25:29, 36:39), ]
| samples | conditions | |
|---|---|---|
| root_control_pGL2_rep1 | root_pGL2 | control |
| root_control_pGL2_rep2 | root_pGL2 | control |
| root_control_pGL2_rep3 | root_pGL2 | control |
| root_hypoxia_pGL2_rep1 | root_pGL2 | hypoxia |
| root_hypoxia_pGL2_rep2 | root_pGL2 | hypoxia |
| root_control_pCO2_rep1 | root_pCO2 | control |
| root_control_pCO2_rep2 | root_pCO2 | control |
| root_hypoxia_pCO2_rep1 | root_pCO2 | hypoxia |
| root_hypoxia_pCO2_rep2 | root_pCO2 | hypoxia |
Use the targets file to replace the data frame in colData slot.
As pre-proccesing conventions, gene expressoin profiling data should be normalized, aggregated, and filtered. The dataset GSE14502 is already normalised by RMA (Gautier et al. 2004), so the pro-processing only includes replicate aggregation and filtering.
The data is aggregated based on ‘sample__condition’ replicates internally, and a slice of the result is shown below.
se.aggr.arab <- aggr_rep(data=se.arab, sam.factor='samples', con.factor='conditions', aggr='mean')
assay(se.aggr.arab)[1:3, c(11:12, 16:17)]
## root_pGL2__control root_pGL2__hypoxia root_pCO2__control
## ORF25 4.961816 5.069883 4.945941
## NAD4L 4.893616 5.263403 5.199492
## ArthMp059 6.533697 6.737153 6.905048
## root_pCO2__hypoxia
## ORF25 5.016907
## NAD4L 5.181175
## ArthMp059 6.103928
Genes with expression values larger than 6 in at least 3% of all samples (pOA=c(0.03, 6)), and with coefficient of variance (CV) between 0.30 and 100 (CV=c(0.30, 100)) are retained.
# Filter genes with low variance and low intensity.
se.fil.arab <- filter_data(data=se.aggr.arab, sam.factor='samples', con.factor='conditions', pOA=c(0.03, 6), CV=c(0.30, 100), dir=NULL)
If users make aSVGs only for their own use rather than sharing with other EBI users, the simplified format is recommended, since it is easier and time-saving.
A path represents a shape. If a tissue consists of multiple paths and is expected to be colored in the spatial heatmap, all its paths must be grouped as a whole (labeled by tag g). A group should not include another group, which means all elements in a group should be single paths. However, if a multiple-path tissue is not expected to be colored in the spatial heatmap, there is no need to group them and the paths can keep random ids.
If a tissue is expected to be colored in the spatial heatmaps, its id value must have an identical tissue counterpart in the data/targets file. It means even a difference of dot, space, underscore, uppercase, or lowercase matters. If a tissue is a group, the group id is considered while the inside path ids are useless.
In the end, all the tissues (groups and single paths) must be grouped together as a container group, and this large group must be the last element in the ‘XML Editor’.
In order to be colored in spatial heatmaps, tissues consisting of multiple shapes should be grouped. Take the root_pGL2 as example. Select shapes of this tissue by clicking their edges while pressing ‘Shift’ key. Mouse over any edge of selected shapes, right click and select ‘Group’.
Click ‘Flat color’ under the ‘Fill’ tab on the fill and stroke panel, and fill the grouped shapes with a preferred color.
Under the ‘Edit’ tab on the top, select ‘XML Editor…’. Click the group to select it. On the ‘XML Editor (Shift+Ctrl+X)’ panel, first click the ‘id’ and type in root_pGL2 then click ‘Set’. It is critial that the id ‘root_pGL2’ has identical tissue counterpart in the data. Otherwise, this tissue will not be colored. After that, the first group is done.
It is optional to add an ‘ontology’ attribute to the tissue. Retrieve the ontology id for ‘root atrichoblast epidermis’ (root_pGL2) at Ontology Lookup Service. Click the tissue in the XML Editor, type in ‘ontology’ and the retreived id in front of and below ‘Set’ respectively, and click ‘Set’, then the ‘ontology’ attribute is assigned.
Similarly, group and set id for root_pCO2, root_pSCR, root_pWOL respectively. When group the small vasculature shapes in the center, a shortcut is to draw a rectangle over them to select all rather than clicking each individually. Note if a tissue sample contains only one shape, there is no need to group it, but the id should be identical with corresponding sample in the data in order to be colored in spatial heatmaps.
Brown, blue, orange, purple labels root_pGL2, root_pCO2, root_pSCR, root_pWOL respectively. The blank shapes have random ids and will not be colored in the spatial heatmap.
At last, it is required to group all tissues (groups and independent paths) as a container group. To do so, use ‘Ctrl+A’ to select all, mouse over the selection, right click, and click group. Note it is optional to set a specific id for this container group.
It is crucial that this container group must be the last element on the ‘XML Editor’.
All the paths need to use absolute positions. To do so, click ‘Preferences…’ under ‘Edit’ tab. Go to ‘Input/Output’ → ‘SVG Output’ → ‘Path data’ → ‘Path string format’, and select ‘Absolute’.
This setting only affects newly created paths. To trigger the effcts on existing paths, click ‘Select All in All Layers’ under the ‘Edit’ tab, and use the arrow key to nudge the selection, e.g. one step forward and one step backward. Then all the paths are rewritten and absolutely positioned. The absolut position is indicated by the capital letters in the ‘d’ attribute of a path.
Next, make sure the canvas are resized as shown in Section Resize Canvas.
By now, the simplified aSVG image is done. Save it as ‘root_cross_simple.svg’, which is downloadable here. Note: the aSVG file name ends with ‘.svg’ and the front should only consist of letters, digits, dots, or underscores. E.g. ‘root_cross_simple(1).svg’ is not acceptable.
To work with the Expression Atlas Anatomogram, the advanced aSVG format is required, and the specific requirements are listed at the bottom of this page. The following sections illustrate the transition from the simplified format to advanced format.
Download the EBI SVG template EBI_template.svg, which is made from the Expression Atlas Anatomogram and open it in Inkscape. It contains 2 layers LAYER_OUTLINE and LAYER_EFO. The former is expected to store organism outline shapes while the latter to store tissue shapes. As a template, both layers are empty except that LAYER_OUTLINE contains a green icon, which links to Expression Atlas Licence.
Copy height, width, viewBox values in top <svg ...> tag from root_cross_simple.svg to EBI_template.svg respectively.
The green icon might be shrunk. To resize the icon, select it, click the lock on the top toolbar, which maintains the aspect ratio, and increase the height (H) or width (W). Move the icon to the bottom left corner.
In root_cross_simple.svg, ungroup (Ctrl+Shift+G) the container group, select all shapes, and copy all.
In EBI_template.svg, open layer panel (Ctrl+Shift+L) and make sure the ‘unlock’ icon is in front of ‘EFO’ and ‘Outline’. Click the ‘EFO’ layer, and paste all tissue shapes from root_cross_simple.svg. Then these shapes are inside the ‘EFO’ layer.
The tissues in ‘EFO’ layer should be annotated, which includes ontology ids and tissue identifiers. Take the root_pSCR (root endodermis) tissue for example. Click the tissue in ‘XML Editor’, and click ‘New element node’ on the top. Type in ‘svg:title’ and click ‘Create’. Then a title node is created at the bottom of root_pSCR.
Click the title node and click ‘New text node’ at the top. Then an empty text node is created inside the title node. Click the text node and type in ‘root_pSCR’ on the right.
Set title id as root_pSCR, then the title is done.
Look up for ‘root endodermis’ that root_pSCR stands for in Ontology Lookup Service, and set the root_pSCR group id as PO:0005059. Then the annotation of root_pSCR is done.
Annotate other tissues in the same way.
If there are outline shapes, they should be placed in the ‘LAYER_OUTLINE’ layer. For illustration purpose, a root outline is created.
Click ‘Outline’ in the layer panel, and draw an outline as explained above. Then the outline shape is created in the ‘Outline’ layer.
Make sure all paths have absolute coordinates as shown in Section Absolute Path Position.
To ensure correct height and width, go to ‘Document Properties…’ under ‘File’ tab, and select ‘Custom size’ under ‘Page’ tab, then set ‘Units’ to ‘px’ and click on ‘Resize page to drawing or selection’.
According to EBI guidlines, all tissue shapes (‘EFO’ layer) are expected to have style="fill:none; stroke:none". To do so, click on the ‘EFO’ in layer panel, select all tissue shapes (Ctrl+A), then click ‘No paint’ under both ‘Fill’ and ‘Stroke paint’ in ‘XML Editor’ panel. Then all the tissue shapes are transparent.
The EBI SVG follows the naming scheme ‘<species>[.view].svg’, so this image can be saved as ‘arabidopsis_thaliana.root_cross.svg’, which is downloadable here.
Both ‘root_cross_simple.svg’ and ‘arabidopsis_thaliana.root_cross.svg’ are ready to use for plotting spatial heatmaps. The latter is packaged in spatialHeatmap and is accessed below.
svg.root <- system.file('extdata/shinyApp/example/arabidopsis_thaliana.root_cross.svg', package='spatialHeatmap')
Plot spatial heatmaps on gene HRE2. In Figure 1, it is manifest that gene HRE2 is showing higher expression level in hypoxia than in control, and thus might play an interesting role in hypoxia resistance.
spatial_hm(svg.path=svg.root, data=se.fil.arab, sam.factor='samples', con.factor='conditions', ID=c("HRE2"), height=0.4, legend.nrow=3, legend.r=1.3, legend.key.size=0.3)
## Enrties not mapped: root_total, root_p35S, root_pSHR, root_pSUC2, root_pSultr2.2, root_pPEP, root_pRPL11C, shoot_total, shoot_p35S, shoot_pGL2, shoot_pRBCS, shoot_pSUC2, shoot_pSultr2.2, shoot_pCER5, shoot_pKAT1
Figure 1 Spatial Heatmaps. The matching tissues between data and aSVG are colored in the middle spatial heatmaps. On the right is the legend plot, where matching tissues are labeled.
This section presents details of data formatting on vector and data frame.
The data class vector applies to several numeric values measured for a single item (e.g. gene). If one or more conditions are provided, the samples and conditions should be connected by double undescore, i.e. in the form of ‘sample__condition’. Since ’__‘is a reserved separator, the naming scheme of ’sample’ and ‘condition’ should not use it. If no conditions are provided, all the samples are assumed to have same condition.
Take the samples and conditions in Table 1 for example. The two samples are ‘root_pGL2’ and ‘root_pCO2’ and two conditions are ‘control’ and ‘hypoxia’. Assume the two samples have matching counterparts in the aSVG. Since there are two conditions for each sample, the vector should contain four target values. The following code generates five random values so that the first four are the target values while the last one is from a third assumed sample that has no counterparts in the aSVG.
# Random numeric values.
vec <- sample(x=1:100, size=5)
Name the first 4 values with the scheme ‘sample__condition’, and last with a random name notMapped. Note each value has a unique name.
# Give unique names to random values.
names(vec) <- c('root_pGL2__control', 'root_pGL2__hypoxia', 'root_pCO2__control', 'root_pCO2__hypoxia', 'notMapped')
vec
## root_pGL2__control root_pGL2__hypoxia root_pCO2__control root_pCO2__hypoxia
## 5 56 50 30
## notMapped
## 68
The class data frame applies to more items (e.g. genes) assayed in several samples and/or conditions (e.g. 2 samples under 2 conditions). Columns and rows are samples/conditions and assayed items respectively. Similarly, if one or more conditions are provided, the column names should follow the scheme ‘sample__conditio’. If no conditions are provided, all the samples are assumed to have same condition.
Take the same samples and conditions in the vector case as example. Make a numeric data frame of 20 rows and 5 columns.
# Make a numeric data frame.
df.test <- data.frame(matrix(sample(x=1:1000, size=100), nrow=20))
Name columns with the names in above vector and rows with 20 genes (gene1, gene2, …, gene20).
# Name the columns.
colnames(df.test) <- names(vec)
# Name the rows.
rownames(df.test) <- paste0('gene', 1:20)
# A slice of the data frame.
df.test[1:3, ]
## root_pGL2__control root_pGL2__hypoxia root_pCO2__control
## gene1 313 718 827
## gene2 230 534 849
## gene3 60 509 493
## root_pCO2__hypoxia notMapped
## gene1 458 289
## gene2 150 893
## gene3 888 796
In the downstream interactive network (refer to the package vignette), if users want to have a gene annotation by mousing over a node, a column of gene annotation can be appended to the data frame. For example, the 20 genes are annotated as ann1, ann2, …, ann20.
df.test$ann <- paste0('ann', 1:20)
df.test[1:3, ]
## root_pGL2__control root_pGL2__hypoxia root_pCO2__control
## gene1 313 718 827
## gene2 230 534 849
## gene3 60 509 493
## root_pCO2__hypoxia notMapped ann
## gene1 458 289 ann1
## gene2 150 893 ann2
## gene3 888 796 ann3
sessionInfo()
## R version 4.0.0 (2020-04-24)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/libblas/libblas.so.3.6.0
## LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] GEOquery_2.56.0 SummarizedExperiment_1.18.1
## [3] DelayedArray_0.14.0 matrixStats_0.56.0
## [5] Biobase_2.48.0 GenomicRanges_1.40.0
## [7] GenomeInfoDb_1.24.0 IRanges_2.22.1
## [9] S4Vectors_0.26.1 BiocGenerics_0.34.0
## [11] spatialHeatmap_0.99.0 knitr_1.28
## [13] nvimcom_0.9-25
##
## loaded via a namespace (and not attached):
## [1] backports_1.1.7 rols_2.16.1 Hmisc_4.4-0
## [4] igraph_1.2.5 lazyeval_0.2.2 shinydashboard_0.7.1
## [7] splines_4.0.0 BiocParallel_1.22.0 ggplot2_3.3.0
## [10] digest_0.6.25 foreach_1.5.0 htmltools_0.4.0
## [13] GO.db_3.11.1 gdata_2.18.0 magrittr_1.5
## [16] checkmate_2.0.0 memoise_1.1.0 cluster_2.1.0
## [19] doParallel_1.0.15 limma_3.44.1 readr_1.3.1
## [22] fastcluster_1.1.25 annotate_1.66.0 prettyunits_1.1.1
## [25] jpeg_0.1-8.1 colorspace_1.4-1 blob_1.2.1
## [28] xfun_0.13 dplyr_0.8.5 crayon_1.3.4
## [31] RCurl_1.98-1.2 jsonlite_1.6.1 genefilter_1.70.0
## [34] impute_1.62.0 survival_3.1-12 iterators_1.0.12
## [37] glue_1.4.1 gtable_0.3.0 zlibbioc_1.34.0
## [40] XVector_0.28.0 scales_1.1.1 DBI_1.1.0
## [43] edgeR_3.30.0 Rcpp_1.0.4.6 viridisLite_0.3.0
## [46] xtable_1.8-4 progress_1.2.2 htmlTable_1.13.3
## [49] gridGraphics_0.5-0 flashClust_1.01-2 foreign_0.8-79
## [52] bit_1.1-15.2 preprocessCore_1.50.0 Formula_1.2-3
## [55] rsvg_2.1 htmlwidgets_1.5.1 httr_1.4.1
## [58] gplots_3.0.3 RColorBrewer_1.1-2 acepack_1.4.1
## [61] ellipsis_0.3.1 farver_2.0.3 pkgconfig_2.0.3
## [64] XML_3.99-0.3 nnet_7.3-14 locfit_1.5-9.4
## [67] dynamicTreeCut_1.63-1 labeling_0.3 ggplotify_0.0.5
## [70] tidyselect_1.1.0 rlang_0.4.6 later_1.0.0
## [73] AnnotationDbi_1.50.0 visNetwork_2.0.9 munsell_0.5.0
## [76] tools_4.0.0 RSQLite_2.2.0 evaluate_0.14
## [79] stringr_1.4.0 fastmap_1.0.1 ggdendro_0.1-20
## [82] yaml_2.2.1 bit64_0.9-7 caTools_1.18.0
## [85] purrr_0.3.4 mime_0.9 xml2_1.3.2
## [88] compiler_4.0.0 rstudioapi_0.11 curl_4.3
## [91] plotly_4.9.2.1 png_0.1-7 tibble_3.0.1
## [94] geneplotter_1.66.0 stringi_1.4.6 highr_0.8
## [97] lattice_0.20-41 Matrix_1.2-18 vctrs_0.3.0
## [100] pillar_1.4.4 lifecycle_0.2.0 BiocManager_1.30.10
## [103] data.table_1.12.8 bitops_1.0-6 grImport_0.9-3
## [106] httpuv_1.5.2 R6_2.4.1 latticeExtra_0.6-29
## [109] promises_1.1.0 KernSmooth_2.23-17 gridExtra_2.3
## [112] codetools_0.2-16 MASS_7.3-51.6 gtools_3.8.2
## [115] assertthat_0.2.1 DESeq2_1.28.1 GenomeInfoDbData_1.2.3
## [118] hms_0.5.3 grid_4.0.0 rpart_4.1-15
## [121] tidyr_1.0.3 rmarkdown_2.1 rvcheck_0.1.8
## [124] shiny_1.4.0.2 WGCNA_1.69 base64enc_0.1-3
Davis, Sean, and Paul Meltzer. 2007. “GEOquery: A Bridge Between the Gene Expression Omnibus (GEO) and BioConductor.” Bioinformatics 14: 1846–7.
Gautier, Laurent, Leslie Cope, Benjamin M. Bolstad, and Rafael A. Irizarry. 2004. “Affy—analysis of Affymetrix GeneChip Data at the Probe Level.” Bioinformatics 20 (3). Oxford, UK: Oxford University Press: 307–15. doi:10.1093/bioinformatics/btg405.
Maag, Jesper L V. 2018. “Gganatogram: An R Package for Modular Visualisation of Anatograms and Tissues Based on Ggplot2.” F1000Res. 7 (September): 1576.
Morgan, Martin, Valerie Obenchain, Jim Hester, and Hervé Pagès. 2018. SummarizedExperiment: SummarizedExperiment Container.
Muschelli, John, Elizabeth Sweeney, and Ciprian Crainiceanu. 2014. “BrainR: Interactive 3 and 4D Images of High Resolution Neuroimage Data.” R J. 6 (1): 41–48.
Mustroph, Angelika, M Eugenia Zanetti, Charles J H Jang, Hans E Holtan, Peter P Repetti, David W Galbraith, Thomas Girke, and Julia Bailey-Serres. 2009. “Profiling Translatomes of Discrete Cell Populations Resolves Altered Cellular Priorities During Hypoxia in Arabidopsis.” Proc Natl Acad Sci U S A 106 (44): 18843–8.
Waese, Jamie, Jim Fan, Asher Pasha, Hans Yu, Geoffrey Fucile, Ruian Shi, Matthew Cumming, et al. 2017. “EPlant: Visualizing and Exploring Multiple Levels of Data for Hypothesis Generation in Plant Biology.” Plant Cell 29 (8): 1806–21.
Winter, Debbie, Ben Vinegar, Hardeep Nahal, Ron Ammar, Greg V Wilson, and Nicholas J Provart. 2007. “An ‘Electronic Fluorescent Pictograph’ Browser for Exploring and Analyzing Large-Scale Biological Data Sets.” PLoS One 2 (8): e718.